This documentation outlines the steps for building a dataset of indicators that draw from multiple sources, with the purpose of monitoring patient outcomes across several domains, including Cancer, Frailty and End of Life, Cardiovascular Health, Mental Health, Respiratory Health, Mortality, Children and Young People, Prevention, and Other.
Additionally, this dataset will support monitoring variations across geographic levels such as PCN, ICB, Ward, Locality (resident), and Local Authority District, as well as demographic groups. This will allow us to assess inequalities by ethnicity and deprivation. There are approximately 120 indicators across 7 domains, and each indicator has its own requirements for dataset construction. This documentation will guide you through the steps involved in building an indicator, based on the conditions specified in the metadata file, which can be found here.
To enable the calculation of rates, percentages, ratios, life expectancy, and other outcome measures across various geographic levels and demographic groups, the dataset includes the following key components:
Indicator metadata: IDs, start and end dates of the indicator reporting period.
Numerator: The count of events or the subset of the total population who experienced the outcome of interest.
Denominator: Total population against which the numerator is measured.
Indicator value: Calculated measures such as crude rates, directly age standardised rates (DASR), percentages, ratios, life expectancy, and other value types.
Confidence intervals: Lower and upper 95% confidence bounds for the indicator value.
Demographic dimensions: Breakdown by IMD quintile, age group, sex, and ethnicity.
Geographic dimensions: Breakdown by PCN, ICB for GP-level data, and Ward, Locality (resident), and Local Authority District (LAD) for resident population data.
Additional metadata: For example, creation_date (when the indicator was created), source_code (the source from which the indicator was derived), and value_type_code (the type of measure).
The dataset is structured using a Fact–Dimension model, where reference (dimension) tables are linked to the central fact table that stores the indicator values.
For an example of the underlying data structure, you can explore the following table:
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]
2 Start Here
Begin by identifying the population type for the indicator in the metadata.
2.1 A) Population type = Census
If the population type is Census, the population (denominator) for this indicator must be derived from Census population estimates to calculate rates or other value types.
Examples: indicators from the SUS and Mortality datasets. These indicators do not have predefined denominators and use Census estimates to derive the population.
2.2 B) Population type = GP registered and a denominator column exists
If the population type is GP registered, identify whether the data source has a denominator column. For example, QOF, CVD Prevent, and Fingertips provide a denominator column in their datasets.
2.3 C) Population type ≠ Census and no denominator column exists
If the population type is not Census and there is no denominator column in the data source, proceed to Chapter 3: Building Indicators with Derived Denominators.This chapter is designed for building indicators where the data source is at the patient-level, allowing us to derive the numerator from the dataset. In addition, the denominator can also be derived from the same data source. For example, the NDA dataset.
All of these datasets will feed into one of the following destination tables, depending on the data source:
[EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]: data stored in the warehouse, such as SUS, QOF, NDA, and Mortality.
[EAT_Reporting_BSOL].[OF].[OF2_Indicator_API_Data]: data extracted via API, such as Fingertips and CVD Prevent.
[EAT_Reporting_BSOL].[OF].[OF2_Indicator_Sharepoint_Data]: data saved in the Sharepoint folder, received from external providers or those indicators that are pre-calculated
[EAT_Reporting_BSOL].[OF].[OF2_Indicator_Other_Data]: data sourced from locations other than those mentioned above.
These tables are then unioned together into a single staging table for processing and value calculation, which is handled separately from this step.
2.4 Flowchart
Show code
flowchart TBA((Start))B{"Population Type = Census?"}C{"Denominator column in data?"}L1(["Chapter 2: Building Indicators without Pre-defined Denominator"])L2(["Example(s): SUS, Mortality"])L3(["Insert data into destination table with denominator = NULL "])L4(["OF dataset"])M1(["Chapter 1: Building Indicators with Pre-defined Denominator"])M2(["Example(s): QOF, CVD Prevent, Fingertips"])M3(["Insert data into destination table with denominator populated"])R1(["Chapter 3: Building Indicators with Derived Denominator"])R2(["Example(s): NDA"])R3(["Insert data into destination table with denominator populated"])A --> BB -- "Yes" --> L1B -- "No" --> CC -- "Yes" --> M1C -- "No" --> R1R1 --> R2R2 --> R3R3 --> L4L1 --> L2L2 --> L3L3 --> L4M1 --> M2M2 --> M3M3 --> L4classDef start fill:#A5D6A7,stroke:#2E7D32,stroke-width:1px,color:#000classDef decision1 fill:#D1C4E9,stroke:#6A1B9A,stroke-width:1px,color:#000classDef decision2 fill:#EF9A9A,stroke:#C62828,stroke-width:1px,color:#000classDef step fill:#BBDEFB,stroke:#1E88E5,stroke-width:1px,color:#000classDef orange fill:#FFD180,stroke:#F57C00,stroke-width:1px,color:#000classDef green fill:#C8E6C9,stroke:#388E3C,stroke-width:1px,color:#000classDef grey fill:#E0E0E0,stroke:#616161,stroke-width:1px,color:#000class A startclass B decision1class C decision2class L1,M1,R1 stepclass L2,M2,R2 orangeclass L3,M3,R3 greenclass L4 grey
flowchart TB
A((Start))
B{"Population Type = Census?"}
C{"Denominator column in data?"}
L1(["Chapter 2: Building Indicators without Pre-defined Denominator"])
L2(["Example(s): SUS, Mortality"])
L3(["Insert data into destination table with denominator = NULL "])
L4(["OF dataset"])
M1(["Chapter 1: Building Indicators with Pre-defined Denominator"])
M2(["Example(s): QOF, CVD Prevent, Fingertips"])
M3(["Insert data into destination table with denominator populated"])
R1(["Chapter 3: Building Indicators with Derived Denominator"])
R2(["Example(s): NDA"])
R3(["Insert data into destination table with denominator populated"])
A --> B
B -- "Yes" --> L1
B -- "No" --> C
C -- "Yes" --> M1
C -- "No" --> R1
R1 --> R2
R2 --> R3
R3 --> L4
L1 --> L2
L2 --> L3
L3 --> L4
M1 --> M2
M2 --> M3
M3 --> L4
classDef start fill:#A5D6A7,stroke:#2E7D32,stroke-width:1px,color:#000
classDef decision1 fill:#D1C4E9,stroke:#6A1B9A,stroke-width:1px,color:#000
classDef decision2 fill:#EF9A9A,stroke:#C62828,stroke-width:1px,color:#000
classDef step fill:#BBDEFB,stroke:#1E88E5,stroke-width:1px,color:#000
classDef orange fill:#FFD180,stroke:#F57C00,stroke-width:1px,color:#000
classDef green fill:#C8E6C9,stroke:#388E3C,stroke-width:1px,color:#000
classDef grey fill:#E0E0E0,stroke:#616161,stroke-width:1px,color:#000
class A start
class B decision1
class C decision2
class L1,M1,R1 step
class L2,M2,R2 orange
class L3,M3,R3 green
class L4 grey
3 Chapter 1: Building Indicators (with Pre-Defined Denominator)
The following steps will guide you through building OF indicators with pre-defined denominators - i.e., when the data source already includes a denominator column. Examples include indicators from QOF, CVD Prevent, and Fingertips.
First, create a new SQL script per indicator and name the file as follows:
ID_{indicator_id}_{indicator_short_name}.sql
Save the script in the corresponding data source folder within the Outcome Framework Rebuild path:
We begin by creating two staging tables to store data at different steps:
One table stores the processed indicator data, which will be used to update the final OF dataset;
The other stores data containing all required columns, enabling us to generate the processed indicator data for the final OF dataset.
3.1.1 Staging table for processed indicator data
This table stores the processed indicator data for each required geography level and demographic split. The data will then be used to update the final indicator data table.
It must therefore include all columns defined in the specified data structure of the destination table (i.e., [EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]).
Show example
DROPTABLEIFEXISTS #staging_QOF_data;CREATETABLE #staging_QOF_data( indicator_id INT, start_date DATE, end_date DATE, numerator FLOAT, denominator FLOAT, indicator_value FLOAT, lower_ci95 FLOAT, upper_ci95 FLOAT, imd_code INT, aggregation_id VARCHAR(20), -- PCN/ICB codes are in VARCHAR, will convert to INT later age_group_code INT, sex_code INT, ethnicity_code INT, creation_date DATE, value_type_code INT, source_code INT);
3.1.2 Temporary table for GP-level raw data
Indicators with a pre-defined denominator are at the GP level. Therefore, we need a temporary table to store data at this level, which can then be used to aggregate data to higher levels (i.e., PCNs and ICB).
Create this temporary table to store the lowest-level data—at the GP level—which will subsequently be used to aggregate data to higher levels such as PCN and ICB.
It must include the columns required to build the indicator data at the chosen geography level and by demographic split, so that the results can be stored in the staging table created in Step 3.1.1.
This table also requires additional columns, such as GP practice codes and their corresponding PCN codes, to enable grouping at higher levels.
These are the columns required for the GP-level data:
indicator_id
start_date and end_date
numerator
denominator
indicator_value
lower_ci9 and upper_ci95
imd, age_group, sex, and ethnicity
creation_date
value_type
source
GP
PCN
3.2 Build GP-level data
From this point onwards, the steps will outline how to build GP-level data in the table created in Step 3.1.2, which will be used to aggregate data to higher levels (i.e., PCN and ICB).
If the data source does not contain GP-level data but instead provides higher-level data such as PCN or ICB, you can continue building the dataset at the available level (e.g., PCN). The required columns remain the same; the only difference is that the lowest geography level will be PCN, and the data will then be aggregated to the ICB level.
Similarly, if the data source only provides ICB-level data, you can continue building the dataset as normal with all the required columns, without the need for aggregation since ICB is already the highest level.
3.2.1 Identify the indicator’s year type
Identify the year type of the indicator in the metadata file.
If the year type is either Financial year or Calendar year, use the following reference table and go to Step 3.2.1.1 to update the start_date and end_date of the indicator data.
Show code
SELECT TOP 1000*FROM [Reference].[dbo].[DIM_tbDate]
Otherwise, update the start_dateand end_date based on how the time period in the data source is structured.
3.2.1.1 Get Start and End Dates
The temporary GP-level data must have the start_date and end_date columns, which correspond to the start and end of the reporting period for that indicator.
You can use the Date reference table above to obtain the start and end dates by joining on the matching data column.
Show example
SELECT T2.[HCSStartOfYearDate] AS start_date ,T2.[HCSEndOfYearDate] AS end_dateINTO #gp_dataFROM [AnalystGlobal].[Performance].[QOFIndicatorsAndPrevalence] AS T1INNERJOIN [Reference].[dbo].[DIM_tbDate] AS T2ON T1.FinancialYear = T2.HCSFinancialYearNameGROUPBY T2.[HCSStartOfYearDate] ,T2.[HCSEndOfYearDate]
Note: Be mindful of duplicate dates that may result from joining the two tables. Ensure that the start and end dates are unique for each year type. For example, 2024-04-01 and 2025-03-31 are the start and end dates, respectively, for the 2024/25 financial year.
3.2.2 Assign indicator ID
Identify the indicator ID from the metadata file, and assign the ID to the respective column indicator_id in the temporary GP-level data.
Show example
UPDATE T1SET indicator_id =1FROM #gp_data AS T1
3.2.3 Extract numerator
Extract counts based on the numerator definition for that indicator in the metadata file at the GP level, including IMD and/or Ethnicity splits if available. This will form the numerator column in the temporary GP data.
Show example
UPDATE T1SET T1.numerator = T2.[DiseaseRegisterSize]FROM #gp_data AS T1INNERJOIN [AnalystGlobal].[Performance].[QOFIndicatorsAndPrevalence] AS T2ON T1.PracticeCode = T2.PracticeCodeINNERJOIN EAT_Reporting_BSOL.[Reference].[BSOL_ICS_PracticeMapped] AS T3ON T2.PracticeCode = T3.GPPracticeCode_OriginalWHERE T3.ICS_2223 ='BSOL'AND T2.IndicatorCode ='STIA001'AND T2.DiseaseRegisterSize ISNOTNULLAND T2.PracticeCode <>'M88006';
3.2.4 Extract denominator
Similarly, extract counts from the denominator column in the data source. This will form the denominator column in the temporary GP data.
Show example
UPDATE T1SET T1.denominator = T2.[PracticeListsize] FROM #gp_data AS T1INNERJOIN [AnalystGlobal].[Performance].[QOFIndicatorsAndPrevalence] AS T2ON T1.PracticeCode = T2.PracticeCodeINNERJOIN EAT_Reporting_BSOL.[Reference].[BSOL_ICS_PracticeMapped] AS T3ON T2.PracticeCode = T3.GPPracticeCode_OriginalWHERE T3.ICS_2223 ='BSOL'AND T2.IndicatorCode ='STIA001'AND T2.DiseaseRegisterSize ISNOTNULLAND T2.PracticeCode <>'M88006';
3.2.5 Update indicator value & CI95 bounds
Leave theindicator_value, lower_ci95 and upper_ci95 (confidence interval (CI) bounds) columns as NULL as they will be calculated in a later step that is separate from this process.
If the data source contains IMD splits, extract the values from the IMD column in the data source and reference them in the temporary GP-level data.
Once you have extracted the IMD column from the data source and inserted it into the GP-level data, you will need map each IMD quintile to its corresponding imd_code, which is a unique code from the OF reference table below:
Show code
SELECT TOP (1000) [imd_code] ,[imd_quintile] ,[imd_quintile_desc]FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_IMD]
Please note that if no IMD splits are available in the data source, assign the code '999' to the IMD column in the temporary GP-level data, which maps to 'All (Persons)' category.
Show example
UPDATE T1SET T1.imd =999FROM #gp_data AS T1
Additionally, if the data source contains IMD splits but some rows have missing quintiles, replace the NULL values in the data source with 'Unknown' category. These can then be mapped to code '-99' using the reference table above.
3.2.6.2 Ethnicity
If the data source contains ethnicity splits, extract the values from the ethnicity column in the data source and reference them in the temporary GP-level data.
Once you have extracted the ethnicity column from the data source and inserted it into the GP-level data, you will need map each ethnicity to its corresponding ethnicity code. Use the number column prefixed ethnicity_code that is relevant to your ethnicity breakdown.
For example:
Ethnicity breakdowns by NHS Code and Census ethnic group must use the ethnicity_code column.
Ethnicity breakdowns for the main 5 groupings and CVD prevent must use the ethnicity_code_main column.
Ethnicity breakdowns for OF groupings must use the ethnicity_code_OF column.
Please note that if no ethnicity splits are available in the data source, assign the code '999' to the ethnicity column in the temporary GP-level data, which maps to 'All (Persons)' category.
Additionally, if the data source contains ethnicity splits but some rows have missing or unknown ethnicity, replace the values with 'Unknown'.These can then be mapped to code '-99' using the reference table above.
3.2.6.3 Sex
Assign the code '999' to the sex column in the temporary GP-level data, which corresponds to 'All (Persons)' category as specified in the reference table below:
Show code
SELECT TOP (1000) [sex_code] ,[sex]FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Sex]
3.2.6.4 Age group
Assign the code '999' to the age_group column in the temporary GP-level data, which corresponds to 'All (Persons)' category as specified in the reference table below:
Update the indicator creation_date column with the current date.
Show example
SELECT getdate() AS CreationDateINTO #GP_data
3.2.6.6 Value type
The value type defines the type of calculation required for a particular indicator. It may include (in)directly age standardised rate, crude rate, percentage, ratio, life expectancy, count, or other.
Identify the value type of the indicator from the metadata file, and assign the corresponding code to the value_type column in the temporary GP-level data table using the reference table below:
Show code
SELECT TOP (1000) [value_type_code] ,[value_type]FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Value_Type]
3.2.6.7 Source
The source column defines where the indicator data is extracted from. Possible sources include data warehouse (SQL), SharePoint site, API, or other.
Assign code '1' to the source column in the temporary GP-level data table if the indicator data is extracted from our data warehouse.
Use the following reference table for further details:
Show code
SELECT TOP (1000) [source_code] ,[source]FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Source]
3.2.7 Add GP and PCN columns
To enable aggregation of data into higher levels, such as PCN and ICB, additional columns such as GP and PCN need to be added to the existing temporary GP-level data table.
Use the following reference table to get the GP and corresponding PCN codes:
Make sure to filter data by BSOL practices only, remove ‘Cape Hill Practice’ and Closed practices.
Show example
SELECTDISTINCT T2.[GPPracticeCode_Original] AS GP ,T2.[PCN code] AS PCNFROM [AnalystGlobal].[Performance].[QOFIndicatorsAndPrevalence] T1INNERJOIN EAT_Reporting_BSOL.Reference.BSOL_ICS_PracticeMapped T2ON T1.[PracticeCode] = T2.[GPPracticeCode_Original]WHERE T2.ICS_2223 ='BSOL'AND IndicatorCode ='STIA001'AND DiseaseRegisterSize ISNOTNULLAND PracticeCode <>'M88006'--Delete cape hill practiceAND PCN <>'Closed practice'
3.2.8 Putting GP-level data together
Here’s an example of building indicator data at the GP level using the QOF dataset.
Show code
/*================================================================================================= Indicator ID 1 - Reference ID 212 - Stroke: QOF prevalence (all ages)=================================================================================================*/DROPTABLEIFEXISTS #gp_data;SELECTDISTINCT1AS indicator_id ,T3.[HCSStartOfYearDate] AS start_date ,T3.[HCSEndOfYearDate] AS end_date ,T1.[DiseaseRegisterSize] AS numerator ,T1.[PracticeListsize] AS denominator ,CAST(NULLASFLOAT) AS indicator_value ,CAST(NULLASFLOAT) AS lower_ci95 ,CAST(NULLASFLOAT) AS upper_ci95 ,999AS imd_code -- All IMD ,T2.[GPPracticeCode_Original] AS gp_practice_code -- GP ,T2.[PCN code] AS pcn_code -- PCN ,999AS age_group_code -- All Ages ,999AS sex_code -- All Persons ,999AS ethnicity_code -- All ethnicity ,CAST(GETDATE() ASDATE) AS creation_date ,2AS value_type_code -- Percentage ,1AS source_code -- SQLINTO #gp_dataFROM [AnalystGlobal].[Performance].[QOFIndicatorsAndPrevalence] AS T1INNERJOIN EAT_Reporting_BSOL.[Reference].[BSOL_ICS_PracticeMapped] AS T2ON T1.[PracticeCode] = T2.[GPPracticeCode_Original]INNERJOIN [Reference].[dbo].[DIM_tbDate] AS T3ON T1.[FinancialYear] = T3.[HCSFinancialYearName]WHERE T2.[ICS_2223] ='BSOL'AND T1.[IndicatorCode] ='STIA001'AND T1.[DiseaseRegisterSize] ISNOTNULLAND T1.[PracticeCode] <>'M88006'; -- Exclude Cape Hill practice
3.3 Build PCN-level data
Use the temporary GP-level data and aggregate it to the PCN-level data by grouping on the required columns.
Then, insert the aggregated data into the staging processed indicator data table created in Step 3.1.1.
Show example
INSERTINTO #staging_QOF_data( indicator_id, start_date, end_date, numerator, denominator, indicator_value, lower_ci95, upper_ci95, imd_code, aggregation_id, age_group_code, sex_code, ethnicity_code, creation_date, value_type_code, source_code)SELECT indicator_id ,start_date ,end_date ,SUM(numerator) AS numerator ,SUM(denominator) AS denominator ,indicator_value ,lower_ci95 ,upper_ci95 ,imd_code -- All IMD ,pcn_code AS aggregation_id -- map to aggregation_id later ,age_group_code -- All ages ,sex_code -- All persons ,ethnicity_code -- All ethnicities ,creation_date ,value_type_code -- Percentage ,source_code -- SQLFROM #gp_dataWHERE pcn_code <>'Closed practice'GROUPBY indicator_id ,start_date ,end_date ,indicator_value ,lower_ci95 ,upper_ci95 ,imd_code ,pcn_code ,age_group_code ,sex_code ,ethnicity_code ,creation_date ,value_type_code ,source_code;
Note that the geography column now contains PCN codes.
3.4 Build ICB-level data
Use the temporary GP-level data and aggregate it to the ICB-level data by grouping on the required columns.
Then, insert the aggregated data into the staging processed indicator data table created in Step 3.1.1.
Show example
/*================================================================================================= ICB-level data=================================================================================================*/INSERTINTO #staging_QOF_data( indicator_id, start_date, end_date, numerator, denominator, indicator_value, lower_ci95, upper_ci95, imd_code, aggregation_id, age_group_code, sex_code, ethnicity_code, creation_date, value_type_code, source_code)SELECT indicator_id ,start_date ,end_date ,SUM(numerator) AS numerator ,SUM(denominator) AS denominator ,indicator_value ,lower_ci95 ,upper_ci95 ,imd_code -- All IMD ,'E38000258'AS aggregation_id -- map to reference ID later ,age_group_code -- All ages ,sex_code -- All persons ,ethnicity_code -- All ethnicities ,creation_date ,value_type_code -- Percentage ,source_code -- SQLFROM #gp_dataGROUPBY indicator_id ,start_date ,end_date ,indicator_value ,lower_ci95 ,upper_ci95 ,imd_code ,age_group_code ,sex_code ,ethnicity_code ,creation_date ,value_type_code ,source_code;
Note that the geography column now contains the BSOL ICB code (E38000258).
3.5 Update geography
Update the geography column in the staging processed indicator data table to ensure that it uses the aggregation_id from the following reference table:
Show code
SELECT TOP (1000) [aggregation_id] ,[aggregation_type] ,[aggregation_code] ,[aggregation_label]FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Geography]
Show example
/*=================================================================================================Map geography codes to aggregation_id=================================================================================================*/UPDATE T1SET T1.[aggregation_id] = T2.[aggregation_id]FROM #staging_QOF_data AS T1JOIN [EAT_Reporting_BSOL].[OF].[OF2_Reference_Geography] AS T2ON T1.[aggregation_id] = T2.[aggregation_code];
When updating the geography column, the aggregation_id can be obtained by joining data based on the aggregation type (PCN or ICB) and the aggregation code (PCN or ICB code).
3.6 Update destination table
Insert the data into the destination table (e.g., [EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]). There is no need to remove old data, as each record has a creation date, and we will select the latest data during processing.
Once you’ve built your indicator, go to the metadata file and fill in any missing fields, as well as add any caveats or notes that are not already included, so we have complete details for each metadata entry.
Metadata checklist:
Review populated fields
Make sure fields reflect our data extracted
Populate the empty fields
Review and update links (some broken because of change of URL in the website)
Review caveat for updates from fingertips for instance
Review status of indicator as some might have changed
4 Chapter 2: Building Indicators (without Pre-Defined Denominator)
The following steps will guide you through building OF indicators without pre-defined denominators - i.e., when the data source does not include a denominator column. Examples include indicators from SUS and Mortality.
In this case, the denominator will be derived using population estimates from the Census, while the numerator will come from the data source. Since the data source is patient-level, the numerator will be calculated by grouping patient-level activity.
In addition, the geography levels for these indicators are based on residence—i.e., Ward, Locality (resident), and Local Authority District (LAD)—rather than GP registration, where the levels would instead be PCN and ICB.
Firstly, create a new SQL script per indicator and name the file as follows:
ID_{indicator_id}_{indicator_short_name}
Save the script in the corresponding data source folder within the Outcome Framework Rebuild path:
This temporary table will store the numerator dataset, containing all columns required to create aggregated data at geographical levels (Ward, Locality Resident, LAD) and with demographic splits.
This table will store the processed indicator data for each required geography level and demographic split, which will subsequently be used to update the final indicator data table.
It must therefore include all columns defined in the specified data structure of the destination table (i.e., [EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]).
Create a cohort of BSOL residents by gathering all admission episodes within the specified period, filtered by patient geography within the BSOL footprint to narrow down the number of episodes.
Insert the episodes IDs into the temporary table for BSOL residents created in Step 4.1.1.
Show example
DECLARE @start_month INTDECLARE @end_month INTSET @start_month =201904SET @end_month =202508INSERTINTO #BSOL_OF_tbIndicator_PtsCohort_IP (episode_id)(SELECT T1.EpisodeIDFROM [EAT_Reporting].[dbo].[tbInpatientEpisodes] T1INNERJOIN [EAT_Reporting].[dbo].[tbIPPatientGeography] T2 ON T1.EpisodeId = T2.EpisodeIdWHERE ReconciliationPoint BETWEEN @start_month AND @end_monthAND T2.OSLAUA IN ('E08000025', 'E08000029') --Bham & Solihull LA)
4.3 Extract numerator
Check the metadata file to get the definition of the numerator and extract it accordingly.
Apply all necessary filters specified in the metadata. For example: diagnosis codes, admission methods, age at admission, diagnosis order, and order in spell.
Group the data appropriately (for example, by episode ID) to calculate the numerator.
Insert the numerator, along with the indicator ID and episode ID, into the numerator dataset (#BSOL_OF_tbStaging_NumeratorData).
If the LSOA 21 column is not available in the data source, use the reference table below to map LSOA 11 to LSOA 21.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[Reference].[LSOA_2011_to_LSOA_2021]
4.4.7 Ward
Use the reference table below to map LSOA 21 to Ward, and update the Ward column in the numerator dataset (#BSOL_OF_tbStaging_NumeratorData) accordingly.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[Reference].[LSOA_2021_WARD_LAD]
Similarly, use the reference table below to map LSOA 21 to LAD, and update the LAD column in the numerator dataset (#BSOL_OF_tbStaging_NumeratorData) accordingly.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[Reference].[LSOA_2021_WARD_LAD]
Use the reference table below to map LSOA 21 to Locality (resident), and update the locality column in the numerator dataset (#BSOL_OF_tbStaging_NumeratorData) accordingly.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[Reference].[LSOA_2021_BSOL_to_Constituency_2025_Locality]
The IMD is based on the population-weighted average IMD scores across LSOAs within each Ward.
Use the reference table below to map each Ward to its average IMD quintile, and update the IMD column in the numerator dataset (#BSOL_OF_tbStaging_NumeratorData) accordingly.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Ward_To_IMD]
Check the metadata file to identify the age group requirements for this indicator.
Also check the value type for the indicator. For example, if the value type is directly age-standardised rate, the age should be grouped into 5-year age bands. Otherwise, the age should follow the groupings specified in the metadata file (e.g., ‘0-18 yrs’, ‘65+ yrs’).
Use the reference table below to map each age to its corresponding 5-year age band.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[Reference].[tbAge]
In this example, each age is mapped to its corresponding 5-year age band, since the indicator’s value type is directly age-standardised rate. For 5-year age bands, ensure that the maximum band is 85+ years.
Finally, use the reference table below to map each age group to its code, and update the age column in the numerator dataset (#BSOL_OF_tbStaging_NumeratorData) accordingly.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Age_Group]
Apply ‘999’ code to the sex column in the numerator dataset (#BSOL_OF_tbStaging_NumeratorData), which corresponds to ‘All (Persons)’.
Show example
UPDATE T1SET T1.[sex] ='999'-- All (persons)FROM #BSOL_OF_tbStaging_NumeratorData T1
4.5.3 Ethnicity
Use the reference table below to map each ethnicity to its code, and update the ethnicity_code column in the numerator dataset (#BSOL_OF_tbStaging_NumeratorData) accordingly.
Use the number column prefixed ethnicity_code relevant to your ethnicity breakdown.
For example:
Ethnicity breakdowns by NHS Code and Census ethnic group must use the ethnicity_code column.
Ethnicity breakdowns for the main 5 groupings and CVD prevent must use the ethnicity_code_main column.
Ethnicity breakdowns for OF groupings must use the ethnicity_code_OF column.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Ethnicity]
Use the reference table below to map each IMD to its code, and update the IMD column in the numerator dataset (#BSOL_OF_tbStaging_NumeratorData) accordingly.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_IMD]
Use the reference table below to map each Ward, LAD, and Locality (resident) to its code, and update the respective columns in the numerator dataset (#BSOL_OF_tbStaging_NumeratorData) accordingly.
Show code
SELECT TOP 1000*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Geography]
The following steps will guide you through aggregating the numerator dataset (#BSOL_OF_tbStaging_NumeratorData) to geography levels (Ward, Locality resident, LAD) with demographic splits.
The aggregated dataset should contain the following columns:
Indicator ID column: indicator_id
Start and end dates of the financial year: start_date, end_date
Creation date column, which is the current date: creation_date
Value type column, as specified in the metadata file: value_type_code
Source column specifying where the indicator is derived from, as specified in the metadata file: source_code
4.6.1 Ward geography
Show example
SELECT T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE) AS start_date, CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE) AS end_date, SUM(T1.[numerator]) AS numerator, CAST(NULLASINT) AS denominator, CAST(NULLASNUMERIC) AS indicator_value, CAST(NULLASNUMERIC) AS lower_ci95, CAST(NULLASNUMERIC) AS upper_ci95, T1.[imd] AS imd_code, T1.[ward_code] AS aggregation_id, T1.[age] AS age_group_code, T1.[sex] AS sex_code, T1.[ethnicity_code] , CAST(CURRENT_TIMESTAMPASDATE) AS creation_date, '4'AS value_type_code --DASR, '1'AS source_code -- SQLFROM #BSOL_OF_tbStaging_NumeratorData T1GROUPBY T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE), CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE), T1.[imd], T1.[ward_code] , T1.[sex], T1.[age], T1.[ethnicity_code]
4.6.2 LAD geography
Show example
SELECT T1.[indicator_id], CAST(LEFT(T1.financial_year, 4) +'-04-01'ASDATE) AS start_date, CAST('20'+RIGHT(T1.financial_year, 2) +'-03-31'ASDATE) AS end_date, SUM(T1.numerator) AS numerator, CAST(NULLASINT) AS denominator, CAST(NULLASNUMERIC) AS indicator_value, CAST(NULLASNUMERIC) AS lower_ci95, CAST(NULLASNUMERIC) AS upper_ci95, T1.[imd] AS imd_code, T1.[lad_code] AS aggregation_id, T1.[age] AS gae_group_code, t1.[sex] AS sex_code , T1.[ethnicity_code] AS ethnicity_code, CAST(CURRENT_TIMESTAMPASdate) AS creation_date, '4'AS value_type_code --DASR, '1'AS source_codeFROM #BSOL_OF_tbStaging_NumeratorData T1GROUPBY T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE), CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE), T1.[imd], T1.[lad_code], T1.[age], T1.[sex], T1.[ethnicity_code]
4.6.3 Locality (resident) geography
Show example
SELECT T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE) AS start_date, CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE) AS end_date, SUM(T1.[numerator]) AS numerator, CAST(NULLASINT) AS denominator, CAST(NULLASNUMERIC) AS indicator_value, CAST(NULLASNUMERIC) AS lower_ci95, CAST(NULLASNUMERIC) AS upper_ci95, T1.[imd] AS imd_code, T1.[locality_res] AS aggregation_id, T1.[age] AS age_group_code, T1.[sex] AS sex_code , T1.[ethnicity_code], CAST(CURRENT_TIMESTAMPASDATE) AS creation_date, '4'AS value_type_code --DASR, '1'AS source_code -- SQLFROM #BSOL_OF_tbStaging_NumeratorData T1WHERE [locality_res] <>'Non-bsol'GROUPBY T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE), CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE), T1.[imd], T1.[locality_res], T1.[age], T1.[sex], T1.[ethnicity_code]
4.7 Insert data into staging table
You can union those three datasets for Ward, Locality (resident), and LAD geographies created in the previous steps, and insert them into the staging table for the processed indicator data created in Step 4.1.3.
For the Locality geography, ensure that ‘Non-BSOL’ locality is removed from the dataset.
Assign the current date to the indicator creation_date.
Additionally, assign the value type and source from the metadata file to the value_type and source column, respectively.
Show example
INSERTINTO #BSOL_OF_tbStaging_SUS_Data ( [indicator_id], [start_date], [end_date], [numerator], [denominator], [indicator_value], [lower_ci95], [upper_ci95], [imd_code], [aggregation_id], [age_group_code], [sex_code], [ethnicity_code], [creation_date], [value_type_code], [source_code])-- Ward Geography(SELECT T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE) AS start_date, CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE) AS end_date, SUM(T1.[numerator]) AS numerator, CAST(NULLASINT) AS denominator, CAST(NULLASNUMERIC) AS indicator_value, CAST(NULLASNUMERIC) AS lower_ci95, CAST(NULLASNUMERIC) AS upper_ci95, T1.[imd] AS imd_code, T1.[ward_code] AS aggregation_id, T1.[age] AS age_group_code, T1.[sex] AS sex_code, T1.[ethnicity_code] , CAST(CURRENT_TIMESTAMPASDATE) AS creation_date, '4'AS value_type_code --DASR, '1'AS source_code -- SQLFROM #BSOL_OF_tbStaging_NumeratorData T1GROUPBY T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE), CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE), T1.[imd], T1.[ward_code] , T1.[sex], T1.[age], T1.[ethnicity_code] UNION-- LAD GeographySELECT T1.[indicator_id], CAST(LEFT(T1.financial_year, 4) +'-04-01'ASDATE) AS start_date, CAST('20'+RIGHT(T1.financial_year, 2) +'-03-31'ASDATE) AS end_date, SUM(T1.numerator) AS numerator, CAST(NULLASINT) AS denominator, CAST(NULLASNUMERIC) AS indicator_value, CAST(NULLASNUMERIC) AS lower_ci95, CAST(NULLASNUMERIC) AS upper_ci95, T1.[imd] AS imd_code, T1.[lad_code] AS aggregation_id, T1.[age] AS gae_group_code, t1.[sex] AS sex_code , T1.[ethnicity_code] AS ethnicity_code, CAST(CURRENT_TIMESTAMPASdate) AS creation_date, '4'AS value_type_code --DASR, '1'AS source_codeFROM #BSOL_OF_tbStaging_NumeratorData T1GROUPBY T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE), CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE), T1.[imd], T1.[lad_code], T1.[age], T1.[sex], T1.[ethnicity_code] UNION--Locality GeographySELECT T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE) AS start_date, CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE) AS end_date, SUM(T1.[numerator]) AS numerator, CAST(NULLASINT) AS denominator, CAST(NULLASNUMERIC) AS indicator_value, CAST(NULLASNUMERIC) AS lower_ci95, CAST(NULLASNUMERIC) AS upper_ci95, T1.[imd] AS imd_code, T1.[locality_res] AS aggregation_id, T1.[age] AS age_group_code, T1.[sex] AS sex_code , T1.[ethnicity_code], CAST(CURRENT_TIMESTAMPASDATE) AS creation_date, '4'AS value_type_code --DASR, '1'AS source_code -- SQLFROM #BSOL_OF_tbStaging_NumeratorData T1WHERE [locality_res] <>'Non-bsol'GROUPBY T1.[indicator_id], CAST(LEFT(T1.[financial_year], 4) +'-04-01'ASDATE), CAST('20'+RIGHT(T1.[financial_year], 2) +'-03-31'ASDATE), T1.[imd], T1.[locality_res], T1.[age], T1.[sex], T1.[ethnicity_code] )
4.8 Create dataset with IMD collapsed to ‘All’
This step enables calculation of rates or other value types by Ethnicity only.
Get all the columns from the staging processed indicator data and assign ‘999’ code to the imd_code column, which corresponds to ‘All (Persons)’ category.
Save the data in a temporary table so that it can later be joined together with the other processed indicator data before inserting them into the destination table.
4.9 Create dataset with Ethnicity collapsed to ‘All’
This step enables calculation of rates or other value types by IMD only.
Get all the columns from the staging processed indicator data and assign ‘999’ code to the ethnicity_code column, which corresponds to ‘All (Persons)’ category.
Save the data in a temporary table so that it can later be joined together with the other processed indicator data before inserting them into the destination table.
Union all datasets (i.e., from tables created in Step 4.7, Step 4.8, and Step 4.9) and insert them into the destination table (e.g., [EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]).
There is no need to remove old data, as each record has a creation date, and we will select the latest data during processing.
Show example
INSERTINTO [EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]SELECT*FROM (SELECT*FROM #BSOL_OF_tbStaging_SUS_DataUNIONSELECT*FROM #temp1UNIONSELECT*FROM #temp2) AS final
4.11 Update metadata
Once you’ve built your indicator, go to the metadata file and fill in any missing fields, as well as add any caveats or notes that are not already included, so we have complete details for each metadata entry.
Metadata checklist:
Review populated fields
Make sure fields reflect our data extracted
Populate the empty fields
Review and update links (some broken because of change of URL in the website)
Review caveat for updates from fingertips for instance
Review status of indicator as some might have changed
5 Chapter 3: Building Indicators with Derived Denominator
The following steps will guide you through building OF indicators, where both the numerator and denominator are derived from their respective data sources. Unlike the first two chapters, this chapter focuses on indicators that do not have an existing denominator column in their data sources and do not rely on population estimates from the Census to derive their denominators.
An example of such an indicator is the National Diabetes Audit (NDA), where the data is at the patient level and contains GP practice information. Therefore, we will aggregate the dataset into PCN and ICB geographical hierarchies with available demographic splits.
First, create a new SQL script for each indicator and name the file as follows:
ID_{indicator_id}_{indicator_short_name}.sql
Save the script in the corresponding data source folder within the Outcome Framework Rebuild path:
We begin by creating a staging table to store the processed indicator data. This table should contain all required columns, matching those in the final OF dataset ([EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]), so that it can be used to update the final dataset.
We will now create a dataset to store patient-level numerator data. Check the metadata file to understand the definition of numerator for your indicator, including any conditions or filters required.
This dataset should have the required columns to enable the aggregation of numerators with geographic and demographic splits. These include indicator id, financial year, IMD quintile, age, sex, ethnicity, GP practice code, and Pseudo NHS number.
Apply all necessary filters such as time periods and specific conditions to obtain the patient cohort as specified in the metadata file. In addition, you can use the following reference table to select only BSOL practices.
SELECT TOP 1000*FROM EAT_Reporting_BSOL.Reference.BSOL_ICS_PracticeMapped
Show example
/*=================================================================================STEP 2: Create patient-level numerator datasetNumerator: Patients at BSOL GP Practices with Type 1 Diabetes who received all 8 care processes=================================================================================*/DROPTABLEIFEXISTS #numerator_data;SELECTDISTINCT35AS indicator_id, T1.AUDIT_YEAR AS financial_year,1AS numerator, T1.IMD_QUINTILE AS imd_quintile, T1.AGE AS age_group, T1.CLEAN_SEX AS sex_code, T1.CLEAN_ETHNICITY AS ethnicity_code, T1.CURRENT_GP AS gp_practice, T1.PatientId AS pseudo_nhs_numberINTO #numerator_dataFROM [LocalFeeds].[Reporting].[NationalDiabetesAudit_NDA_Core_Data] AS T1INNERJOIN EAT_Reporting_BSOL.Reference.BSOL_ICS_PracticeMapped AS T2ON T1.CURRENT_GP = T2.GPPracticeCode_OriginalWHERE T2.ICS_2223 ='BSOL'AND T1.CLEAN_DIABETES_TYPE IN ('1','01') -- Type 1 DiabetesAND T1.ALL_8_CARE_PROCESSES =1-- Received all 8 care processesAND AUDIT_YEAR IN ('201415', '201516', '201617', '201718', '201819', '201920', '202021', '202122E4', '202223', '202324E3');
5.2.1 Update ethnicity from local demographic table
Update the ethnicity code in your patient-level numerator data using the local demographic table below when the pseudo NHS number matches. Otherwise, retain the ethnicity code from the data source.
SELECT TOP 10000*FROM EAT_Reporting_BSOL.Demographic.Ethnicity
Show example
/*=================================================================================STEP 3: Update numerator ethnicity code from Local Demographics=================================================================================*/UPDATE T1SET T1.ethnicity_code = T2.Ethnic_CodeFROM #numerator_data AS T1INNERJOIN EAT_Reporting_BSOL.Demographic.Ethnicity AS T2ON T1.pseudo_nhs_number = T2.Pseudo_NHS_Number;
5.2.2 Aggregate numerator
Aggregate the numerator dataset to obtain total numerators across the extracted columns. At this point, you will have a GP-level aggregated dataset (no longer at patient-level), which will later be combined with the denominator dataset before finalising the destination table.
Next, create a dataset to store patient-level denominator data. Refer to the metadata file to understand the definition of denominator for your indicator, including any conditions or filters required.
This dataset include the necessary columns to allow aggregation of denominators by geographic and demographic splits. These should include: indicator id, financial year, IMD quintile, age, sex, ethnicity, GP practice code, and Pseudo NHS number. The columns in the denominator dataset must match those in the numerator dataset created in Step 5.2.
Apply all required filters, such as time periods and specific conditions, to obtain the population defined in the metadata file. In addition, you can use the following reference table to select only BSOL practices.
SELECT TOP 1000*FROM EAT_Reporting_BSOL.Reference.BSOL_ICS_PracticeMapped
Show example
/*=================================================================================STEP 5: Create patient-level denominator datasetDenominator: Patients at BSOL GP Practices with Type 1 Diabetes=================================================================================*/DROPTABLEIFEXISTS #denominator_data;SELECTDISTINCT35AS indicator_id, T1.AUDIT_YEAR AS financial_year,1AS denominator, T1.IMD_QUINTILE AS imd_quintile, T1.AGE AS age_group, T1.CLEAN_SEX AS sex_code, T1.CLEAN_ETHNICITY AS ethnicity_code, T1.CURRENT_GP AS gp_practice, T1.PatientId AS pseudo_nhs_numberINTO #denominator_dataFROM [LocalFeeds].[Reporting].[NationalDiabetesAudit_NDA_Core_Data] AS T1INNERJOIN EAT_Reporting_BSOL.Reference.BSOL_ICS_PracticeMapped AS T2ON T1.CURRENT_GP = T2.GPPracticeCode_OriginalWHERE T2.ICS_2223 ='BSOL'AND T1.CLEAN_DIABETES_TYPE IN ('1','01') AND AUDIT_YEAR IN ('201415', '201516', '201617', '201718', '201819', '201920', '202021', '202122E4', '202223', '202324E3')
5.3.1 Update ethnicity from local demography table
Update the ethnicity code in your patient-level denominator data using the local demographic table below when the pseudo NHS number matches. Otherwise, retain the ethnicity code from the data source.
SELECT TOP 10000*FROM EAT_Reporting_BSOL.Demographic.Ethnicity
Show example
/*=================================================================================STEP 6: Update denominator ethnicity code from Local Demographics=================================================================================*/UPDATE T1SET T1.ethnicity_code = T2.Ethnic_CodeFROM #denominator_data AS T1INNERJOIN EAT_Reporting_BSOL.Demographic.Ethnicity AS T2ON T1.pseudo_nhs_number = T2.Pseudo_NHS_Number;
5.3.2 Aggregate denominator
Aggregate the denominator dataset to obtain total denominators across the extracted columns. At this point, you will have a GP-level aggregated dataset (no longer at patient-level), which will later be combined with the numerator dataset before finalising the destination table.
Once you have prepared both the aggregated numerator and denominator datasets, combine them into a single dataset by bringing the numerator into the denominator (e.g., using a LEFT JOIN from the denominator). This ensures every eligible population group is retained and that groups with no events correctly appear with a numerator of zero. Joining the other way around would restrict the data to groups with events only, which could result in numerator and denominator counts appear identical.
Standardise the financial year to the format YYYY-YY and remove any suffixes such as E3 or E4. This will make it easier to obtain the start and end dates of the financial year.
Show example
/*=================================================================================STEP 9: Standardise/Map dimension codes on the combined dataset=================================================================================*/-- 9.1 Standardise financial year to 'YYYY-YY' (removes any suffix like E3/E4)UPDATE T1SET T1.financial_year =LEFT(T1.financial_year, 4) +'-'+ SUBSTRING(T1.financial_year, 5, 2)FROM #combined_dataset T1;
5.5.2 Age
Refer to the metadata file to identify the age group for your indicator.
Then, map this age group to its corresponding code using the reference table below:
For example, if the sex category is 'All (Persons)', then the correct code is '999'.
Show example
-- 9.3 Set Sex to "All" categories as per metadataUPDATE T1 SET T1.sex_code ='999'FROM #combined_dataset AS T1; -- All persons
5.5.4 Ethnicity
If your dataset has ethnicity splits, map each ethnicity to its corresponding code using the reference table below. Use the number column prefixed ethnicity_code relevant to your ethnicity breakdown.
For example:
Ethnicity breakdowns by NHS Code and Census ethnic group must use the ethnicity_code column.
Ethnicity breakdowns for the main 5 groupings and CVD prevent must use the ethnicity_code_main column.
Ethnicity breakdowns for OF groupings must use the ethnicity_code_OF column.
Otherwise, assign '999' code to the IMD column, which corresponds to ’All (Persons)' category.
Additionally, ensure that any missing quintiles are replaced with the code '-99', which corresponds to the 'Unknown' category.
Show example
-- 9.5 Map IMD quintile to OF code; default to '-99' when missing/NULLUPDATE T1SET T1.imd_quintile = T2.imd_codeFROM #combined_dataset T1INNERJOIN [EAT_Reporting_BSOL].[OF].[OF2_Reference_IMD] T2ON T1.imd_quintile = T2.imd_quintileUPDATE T1SET T1.imd_quintile ='-99'FROM #combined_dataset T1WHERE T1.imd_quintile =''OR T1.imd_quintile ISNULL
5.6 Build GP-level data
You now have a combined dataset of numerators and denominators, along with demographic splits at the GP level. Next, enrich this dataset with additional columns to match those in the final dataset, as well as GP and PCN codes, to enable aggregation at higher levels.
These include:
start_date and end_date of the financial year;
numerator and denominator;
empty columns for indicator_value, lower_ci95, and upper_ci_95;
demographic splits such as IMD_code, sex_code, age_group_code, and ethnicity_code;
metadata such as creation_date, value_type_code, and source_code
You may use the reference table below to get the start and end dates of the financial year:
SELECT TOP 1000*FROM [Reference].[dbo].[DIM_tbDate]
Ensure that you select only BSOL practices and exclude the following:
GP practice = Cape Hill Medical Centre (M88006)
PCN = Closed practice
Show example
/*=================================================================================STEP 10: Create GP-level dataset with additional columns=================================================================================*/DROPTABLEIFEXISTS #gp_data;SELECT1AS indicator_id ,CAST(LEFT(T1.financial_year, 4) +'-04-01'ASDATE) AS start_date ,CAST('20'+RIGHT(T1.financial_year, 2) +'-03-31'ASDATE) AS end_date ,T1.numerator ,T1.denominator ,CAST(NULLASFLOAT) AS indicator_value ,CAST(NULLASFLOAT) AS lower_ci95 ,CAST(NULLASFLOAT) AS upper_ci95 ,imd_quintile AS imd_code -- All IMD ,T1.gp_practice AS gp_practice_code -- GP ,T2.[PCN code] AS pcn_code -- PCN ,T1.age_group AS age_code -- All Ages ,T1.sex_code -- All Persons ,T1.ethnicity_code ,CAST(GETDATE() ASDATE) AS creation_date ,2AS [value_type_code] -- Percentage ,1AS [source_code] -- SQLINTO #gp_dataFROM #combined_dataset AS T1INNERJOIN EAT_Reporting_BSOL.[Reference].[BSOL_ICS_PracticeMapped] AS T2ON T1.gp_practice = T2.GPPracticeCode_OriginalWHERE T2.ICS_2223 ='BSOL'AND T1.gp_practice <>'M88006'AND T2.[PCN code] <>'closed practice';
5.7 Build PCN-level data
Next, create a PCN-level dataset by using the GP-level dataset created in Step 5.6 and grouping the data by the required columns.
Insert this dataset into the staging table created in Step 5.1.
Notice that the pcn_code column serves the aggregation level for this dataset.This column will later be updated to use the aggregation IDs from the OF geography reference table instead of the codes.
Show example
/*=======================================================================================================STEP 11: Create PCN-level dataset=======================================================================================================*/INSERTINTO #staging_NDA_data ( [indicator_id] ,[start_date] ,[end_date] ,[numerator] ,[denominator] ,[indicator_value] ,[lower_ci95] ,[upper_ci95] ,[imd_code] ,[aggregation_id] ,[age_group_code] ,[sex_code] ,[ethnicity_code] ,[creation_date] ,[value_type_code] ,[source_code] )SELECT [indicator_id] ,[start_date] ,[end_date] ,sum(numerator) as numerator ,sum(denominator) as denominator ,[indicator_value] ,[lower_ci95] ,[upper_ci95] ,[imd_code] -- All IMD ,[pcn_code] -- Will map it to aggregation id later ,[age_code] -- All Ages ,[sex_code] -- All Persons ,[ethnicity_code] ,[creation_date] ,[value_type_code] -- Percentage ,[source_code] -- SQLFROM #gp_dataGROUPBY [indicator_id] ,[start_date] ,[end_date] ,[indicator_value] ,[lower_ci95] ,[upper_ci95] ,[imd_code] ,[pcn_code] ,[age_code] ,[sex_code] ,[ethnicity_code] ,[creation_date] ,[value_type_code] ,[source_code]
5.8 Build ICB-level data
Next, create an ICB-level dataset by using the GP-level dataset created in Step 5.6 and grouping the data by the required columns.
Assign E38000258 to the aggregation_id column, as this serves as the aggregation level for this dataset. This column will later be updated to use the aggregation IDs from the OF geography reference table instead of the codes.
Insert this dataset into the staging table created in Step 5.1.
Show example
/*=======================================================================================================STEP 12: Create ICB-level dataset=======================================================================================================*/INSERTINTO #staging_NDA_data ( [indicator_id] ,[start_date] ,[end_date] ,[numerator] ,[denominator] ,[indicator_value] ,[lower_ci95] ,[upper_ci95] ,[imd_code] ,[aggregation_id] ,[age_group_code] ,[sex_code] ,[ethnicity_code] ,[creation_date] ,[value_type_code] ,[source_code] )SELECT [indicator_id] ,[start_date] ,[end_date] ,sum(numerator) as numerator ,sum(denominator) as denominator ,[indicator_value] ,[lower_ci95] ,[upper_ci95] ,[imd_code] -- All IMD ,'E38000258'AS icb_code -- Will map it to agrgegation id later ,[age_code] -- All Ages ,[sex_code] -- All Persons sex ,[ethnicity_code] ,[creation_date] ,[value_type_code] --Percentage ,[source_code] --SQLFROM #gp_dataGROUPBY [indicator_id] ,[start_date] ,[end_date] ,[indicator_value] ,[lower_ci95] ,[upper_ci95] ,[imd_code] ,[age_code] ,[sex_code] ,[ethnicity_code] ,[creation_date] ,[value_type_code] ,[source_code]
5.9 Update geography
Map the PCN codes and the ICB code to their corresponding aggregation IDs using the reference table below:
This ensures that the processed indicator data uses standardised codes and a consistent data structure.
Show example
/*=======================================================================================================STEP 13: Map geography codes to aggregation_id=======================================================================================================*/UPDATE T1SET T1.[aggregation_id] = T2.[aggregation_id]FROM #staging_NDA_data AS T1JOIN [EAT_Reporting_BSOL].[OF].[OF2_Reference_Geography] AS T2ON T1.[aggregation_id] = T2.[aggregation_code];
5.10 Insert data into destination table
Insert the data into the destination table (e.g., [EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]). There is no need to remove old data, as each record has a creation date, and we will select the latest data during processing.
Once you’ve built your indicator, go to the metadata file and fill in any missing fields, as well as add any caveats or notes that are not already included, so we have complete details for each metadata entry.
Metadata checklist:
Review populated fields
Make sure fields reflect our data extracted
Populate the empty fields
Review and update links (some broken because of change of URL in the website)
Review caveat for updates from fingertips for instance
Review status of indicator as some might have changed
-- Ethnicity lookup tableSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Ethnicity]-- IMD lookup tableSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_IMD]-- Age group lookup tableSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Age_Group]-- Sex lookup tableSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Sex]-- Geography lookup tableSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Geography]-- Source lookup tableSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Source]-- Value type lookup tableSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Reference_Value_Type]
6.3 Destination tables
Processed indicator data from warehouse
Show code
-- destination table for the processed indicator data extracted from warehouseSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Indicator_SQL_Data]
Processed indicator data from API
Show code
-- destination table for the processed indicator data extracted from APISELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Indicator_API_Data]
Processed indicator data from SharePoint
Show code
-- destination table for the processed indicator data extracted from SharepointSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Indicator_Sharepoint_Data]
Processed indicator data from other sources
Show code
-- destination table for the processed indicator data extracted from other sourcesSELECT*FROM [EAT_Reporting_BSOL].[OF].[OF2_Indicator_Other_Data]
7 Limitations
7.1 IMD quintile
When creating datasets for both numerator and denominator for crude and age-standardised rate indicators, the IMD quintile column was derived from the ward-level IMD score. This ward-level score was calculated as the population-weighted average of the IMD scores of all LSOAs within the ward, with each LSOA’s score weighted by its population. As a result, each ward was assigned only one IMD quintile.
We used this package to get a dataset of population-weighted average IMD scores for wards in England humaniverse/IMD, README.
This approach was carried forward when aggregating to higher geographies. For example, the IMD quintiles for a locality were based on the ward-level IMD scores of all wards within that locality, which themselves were population-weighted averages of LSOA scores. This means the locality-level quintiles are indirectly based on aggregated values rather than directly on individual LSOA scores, which may smooth out variation within wards.
Because IMD quintiles at ward level are based on population-weighted average scores, they reflect the overall deprivation for the majority of residents in that ward. This approach can mask pockets of high deprivation within otherwise less deprived wards. For example, if most of the ward’s population lives in low-deprivation areas, small neighbourhood (LSOAs) with high deprivation will have little influence on the ward’s average score. As a result, those deprived pockets may not be visible in ward-level quintiles, and interventions targeted only at “most deprived wards” may overlook these communities.